Variance & Standard Deviation Intuition

Variance definition

$$Var(X)=\frac{\sum_{i=1}^{n}(x_i - \mu)^2} {n}$$

Standard deviation definition

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \mu)^2} {n}}$$
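As a minimal sketch, the two definitions above can be computed directly. The data values are hypothetical and chosen only for illustration; the check against `statistics.pvariance` and `statistics.pstdev` assumes the population (divide by n) form used in the formulas.

```python
import math
import statistics

# Hypothetical data values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mu = sum(data) / n                               # the mean
variance = sum((x - mu) ** 2 for x in data) / n  # divide by n, as in the formula
std_dev = math.sqrt(variance)                    # square root of the variance

# The standard library's population versions should agree.
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))

print(mu, variance, std_dev)  # 5.0 4.0 2.0
```

Note that `statistics.variance` and `statistics.stdev` divide by n - 1 instead (the sample versions), so they would give slightly larger values on the same data.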

What they both measure:

  • Both quantify the amount of variation or dispersion in a set of data values. A low value indicates that the data points tend to be close to the mean of the set, while a high value indicates that the data points are spread out over a wider range of values.
  • They indicate a typical "distance from the mean" (the amount by which X tends to deviate from its average value).
  • Informally, they measure the amount of information in the data set. For example, say my data is the heights of k people, with variance X, and I want to account for their sex, which explains variance Y. The "unexplained variance" is then X - Y. If you know everyone's sex, you can make educated predictions of what their height will be: sex gives you information about height. But it doesn't give you all the information; you are still missing the X - Y variance.
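The height example can be sketched numerically. The heights below are invented for illustration, not real measurements; `total_var` plays the role of X, `between_var` (the variance explained by sex) plays the role of Y, and `within_var` is the unexplained X - Y. The decomposition is exact when the population (divide by n) variance is used.

```python
import math
import statistics

# Hypothetical heights in cm, grouped by sex; invented for illustration.
heights = {
    "female": [160, 164, 168],
    "male": [174, 178, 182],
}

all_heights = [h for group in heights.values() for h in group]
n = len(all_heights)
grand_mean = statistics.mean(all_heights)

# X: total variance of height, ignoring sex.
total_var = statistics.pvariance(all_heights)

# Y: explained variance, the spread of the group means around the overall mean.
between_var = sum(
    len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in heights.values()
) / n

# X - Y: unexplained variance, the spread left around each group's own mean.
within_var = sum(len(g) * statistics.pvariance(g) for g in heights.values()) / n

assert math.isclose(total_var, between_var + within_var)
print(f"total={total_var:.2f} explained={between_var:.2f} unexplained={within_var:.2f}")
```

Knowing someone's sex here lets you predict their height using the group mean, and the remaining error of that prediction is the within-group variance.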

The meaning of the equation

If the goal is to summarise the spread, then we need a good way of measuring that spread. A good place to start is to quantify spread using distance: how far each data point lies from the mean.
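One way to see why the formula squares the distances is to try the alternatives. The sketch below uses hypothetical data values: the signed distances from the mean cancel out to zero, so they cannot measure spread, whereas absolute or squared distances can. Variance is the average squared distance.

```python
# Hypothetical data values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mu = sum(data) / n
deviations = [x - mu for x in data]  # signed distance of each point from the mean

mean_signed = sum(deviations) / n                   # positives and negatives cancel
mean_absolute = sum(abs(d) for d in deviations) / n
mean_squared = sum(d ** 2 for d in deviations) / n  # this is the variance

print(mean_signed, mean_absolute, mean_squared)  # 0.0 1.5 4.0
```

Squaring also weights points far from the mean more heavily than the absolute value does, which is why the two summaries differ.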


Why have both

  • Variance:

    • Variance is additive. That means that if you take many samples independently from the same distribution, the variance of their sum grows linearly in the number of samples. Understanding it also helps you understand covariance and correlation. For instance, for a trait like IQ, if a twin study concluded that 75% of the variance is due to genetics, then, neatly, the remaining 25% of the variance is due to something other than genetics.
    • Easier to work with mathematically in theoretical derivations.
  • Standard deviation:

    • Expressed in the same units as the data (and therefore the same units as the mean), which makes it easier to interpret.
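The additivity of variance can be checked exactly on a small case. The sketch below assumes two independent rolls of a fair six-sided die (an illustrative choice, not from the text above) and uses exact fractions; `exact_variance` is a helper defined here, not a library function.

```python
from fractions import Fraction
from itertools import product
import math


def exact_variance(outcomes):
    """Population variance of equally likely outcomes, as an exact fraction."""
    n = len(outcomes)
    mu = Fraction(sum(outcomes), n)
    return sum((x - mu) ** 2 for x in outcomes) / n


faces = [1, 2, 3, 4, 5, 6]
var_one = exact_variance(faces)  # one roll

# All 36 equally likely sums of two independent rolls.
sums = [a + b for a, b in product(faces, repeat=2)]
var_two = exact_variance(sums)

# Variance adds: Var(A + B) = Var(A) + Var(B) for independent A and B.
assert var_two == 2 * var_one

# Standard deviation does not add: sd(A + B) is less than sd(A) + sd(B).
sd_one = math.sqrt(var_one)
sd_two = math.sqrt(var_two)
assert sd_two < 2 * sd_one

print(var_one, var_two)  # 35/12 35/6
```

This is the trade-off in the list above: variance composes cleanly, while standard deviation is the one you can read in the data's own units.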

Good graph explanations


In [9]:
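from IPython.display import Image, display

# External images that illustrate variance graphically.
variance_graphs = [
    'http://www.stepbystep.com/wp-content/uploads/2013/05/Difference-between-Variance-and-Covariance-Variance.jpg',
    'http://blog.fliptop.com/wp-content/uploads/2015/01/Bias_Variance.jpg']

# Render each image inline in the notebook.
for graph in variance_graphs:
    display(Image(url=graph))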